Abstract

The geometric hitting set problem is one of the basic geometric combinatorial optimization problems:
given a set $P$ of points and a set $\mathcal{D}$ of geometric objects in the plane, the goal is to compute a
small-sized subset of $P$ that hits all objects in $\mathcal{D}$. In 1994, Brönnimann and Goodrich [5] made an
important connection of this problem to the size of fundamental combinatorial structures called $\epsilon$-nets,
showing that small-sized $\epsilon$-nets imply approximation algorithms with correspondingly small approximation
ratios. Very recently, Agarwal and Pan [2] showed that their scheme can be implemented in near-linear time for
disks in the plane. Altogether this gives $O(1)$-factor approximation algorithms in $\tilde{O}(n)$ time for
hitting sets for disks in the plane.

This constant factor depends on the sizes of $\epsilon$-nets for disks; unfortunately, the current
state-of-the-art bounds are large - at least $24/\epsilon$ and most likely larger than $40/\epsilon$. Thus the
approximation factor of the Agarwal-Pan algorithm ends up being more than 40. The best lower bound is
$2/\epsilon$, which follows from the Pach-Woeginger construction [26] for halfspaces in two dimensions. Thus
there is a large gap between the best-known upper and lower bounds. Besides being of independent interest,
finding precise bounds is important since this immediately implies an improved linear-time algorithm for the
hitting-set problem.

The main goal of this paper is to improve the upper bound to $13.4/\epsilon$ for disks in the plane. The proof
is constructive, giving a simple algorithm that uses only Delaunay triangulations. We have implemented
the algorithm, which is available as a public open-source module. Experimental results show that the
sizes of $\epsilon$-nets for a variety of data-sets are lower, around $9/\epsilon$.


1 Introduction 


The minimum hitting set problem is one of the most fundamental combinatorial optimization problems:
given a range space $(P, \mathcal{D})$ consisting of a set $P$ and a set $\mathcal{D}$ of subsets of $P$ called
the ranges, the task is to compute the smallest subset $Q \subseteq P$ that has a non-empty intersection with
each of the ranges in $\mathcal{D}$.
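
The definition above can be made concrete on a toy instance. The following is a minimal sketch, not part of the method of this paper: it checks the hitting condition and finds a smallest hitting set by exhaustive search. The function names and the four-point instance are hypothetical, and the search is exponential, so it is only meant for very small inputs.

```python
from itertools import combinations

def is_hitting_set(candidate, ranges):
    """True if `candidate` shares at least one point with every range."""
    chosen = set(candidate)
    return all(chosen & r for r in ranges)

def minimum_hitting_set(points, ranges):
    """Exhaustive search by increasing size (illustration only)."""
    for size in range(len(points) + 1):
        for subset in combinations(points, size):
            if is_hitting_set(subset, ranges):
                return set(subset)
    return None  # some range is empty, so nothing can hit it

# Hypothetical toy range space: ranges are given explicitly as sets.
points = ["a", "b", "c", "d"]
ranges = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
best = minimum_hitting_set(points, ranges)
```

On this instance no single point hits all three ranges, so the search stops at size two.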


This problem is strongly NP-hard. If there are no restrictions on the set system $\mathcal{D}$, then it is
known that it is NP-hard to approximate the minimum hitting set within a logarithmic factor of the optimal
[28]. The problem is NP-complete even for the case where each range has exactly two points, since this problem
is equivalent to the vertex cover problem, which is known to be NP-complete [20, 14]. A natural occurrence
of the hitting set problem is when the range space is derived from geometry - e.g., given a set $P$ of
$n$ points in $\mathbb{R}^2$ and a set $\mathcal{D}$ of $m$ triangles containing points of $P$, compute the
minimum-sized subset of $P$ that hits all the triangles in $\mathcal{D}$. Unfortunately, for most natural
geometric range spaces, computing the minimum-sized hitting set remains NP-hard. For example, even the
(relatively) simple case where $\mathcal{D}$ is a set of unit disks in the plane is strongly NP-hard [19].
Therefore fast algorithms for computing provably good approximate hitting sets for geometric range spaces have
been intensively studied for the past three decades (e.g., see the two recent PhD theses on this topic [12, 13]).

The case studied in this paper - hitting sets for disks in the plane - has been the subject of a long line of
research. The case when all the disks have the same radius is easier, and has been studied in a series of
works: Calinescu et al. proposed a 108-approximation algorithm, which was subsequently improved by
Ambühl et al. to 72. Carmi et al. [8] further improved that to a 38-approximation algorithm, though with
a running time of $O(n^6)$. Claude et al. [10] were able to achieve a 22-approximation algorithm running in
time $O(n^6)$. More recently Fraser et al. [15] presented an 18-approximation algorithm in time $O(n^2)$.

So far, besides ad-hoc approaches, there are two systematic approaches on which all progress on the
hitting-set problem for geometric ranges has relied: rounding via $\epsilon$-nets, and local search. The
local-search approach starts with any hitting set $S \subseteq P$, and repeatedly decreases the size of $S$, if
possible, by replacing $k$ points of $S$ with at most $k - 1$ points of $P \setminus S$. Call such an algorithm
a $k$-local search algorithm. It has been shown [24] that a $k$-local search algorithm for the hitting set
problem for disks in the plane gives a PTAS. Unfortunately the running time of their algorithm to compute a
$(1 + \epsilon)$-approximation is $O(n^{O(1/\epsilon^2)})$. Very recently Bus et al. were able to improve the
analysis and algorithm of the local-search approach to design an 8-approximation running in time $O(n^{2.33})$.
However, at this moment, a near-linear time algorithm based on local search seems beyond reach. We currently do
not even know how to compute the most trivial case, namely when $k = 1$, of local search in near-linear time:
given the set of disks $\mathcal{D}$ and a set of points $P$, compute a minimal hitting set in $P$ of
$\mathcal{D}$.
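
To make the $k = 1$ case concrete, here is a naive sketch of computing a minimal (not minimum) hitting set by pruning redundant points. It is a straightforward illustration with roughly cubic cost in the worst case, not a near-linear method; the function name, the disk representation `((cx, cy), r)` and the example data are assumptions made for this sketch, and the points are assumed to be distinct.

```python
def minimal_hitting_set(points, disks):
    """Drop points one at a time while every disk stays hit (1-local search)."""
    def inside(p, disk):
        (cx, cy), r = disk
        return (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= r * r

    def hits_all(subset):
        return all(any(inside(p, d) for p in subset) for d in disks)

    current = list(points)
    if not hits_all(current):
        raise ValueError("the point set does not hit every disk")
    # One pass suffices: a point that is needed by a superset of the final
    # set is still needed by the final set itself.
    for p in list(current):
        trial = [q for q in current if q != p]
        if hits_all(trial):
            current = trial
    return current

# Hypothetical instance: three collinear points and two overlapping disks.
pts = [(0, 0), (1, 0), (2, 0)]
dsk = [((0.5, 0), 0.6), ((1.5, 0), 0.6)]
result = minimal_hitting_set(pts, dsk)
```

Here the middle point lies in both disks, so the two outer points are pruned.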


Rounding via $\epsilon$-nets. Given a range space $(P, \mathcal{D})$ and a parameter $\epsilon > 0$, an
$\epsilon$-net is a subset $S \subseteq P$ such that $D \cap S \neq \emptyset$ for all $D \in \mathcal{D}$ with
$|D \cap P| \ge \epsilon n$. The famous "$\epsilon$-net theorem" of Haussler and Welzl [18] states that for
range spaces with VC-dimension $d$, there exists an $\epsilon$-net of size
$O(\frac{d}{\epsilon} \log \frac{d}{\epsilon})$ (this bound was later improved to
$O(\frac{d}{\epsilon} \log \frac{1}{\epsilon})$, which was shown to be optimal in general [25, 27]). Sometimes,
weighted versions of the problem are considered in which each $p \in P$ has some positive weight associated
with it so that the total weight of all elements of $P$ is 1. The weight of each range is the sum of the weights
of the elements in it. The aim is to hit all ranges with weight more than $\epsilon$. The condition of having
finite VC-dimension is satisfied by many geometric set systems: disks, half-spaces, $k$-sided polytopes,
$r$-admissible sets of regions etc. in $\mathbb{R}^d$. For certain range spaces, one can even show the existence
of $\epsilon$-nets of size $O(1/\epsilon)$ - an important case being disks in $\mathbb{R}^2$ [27].
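
As an illustration of the definition, the sketch below tests the $\epsilon$-net condition against a finite, explicitly given list of disks. This is only a partial check, since the definition quantifies over all disks; the function name, the disk representation `((cx, cy), r)` and the sample data are assumptions of this sketch.

```python
def is_eps_net(net, points, disks, eps):
    """Check that every listed disk holding >= eps*n points contains a net point."""
    threshold = eps * len(points)
    for (cx, cy), r in disks:
        def inside(p):
            return (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= r * r
        heavy = sum(1 for p in points if inside(p)) >= threshold
        if heavy and not any(inside(p) for p in net):
            return False  # a heavy disk is missed
    return True

# Hypothetical data: ten points on a line, one light disk and one heavy disk.
line = [(x, 0) for x in range(10)]
sample_disks = [((1, 0), 1.5),   # contains 3 points: light for eps = 0.4
                ((7, 0), 2.5)]   # contains 5 points: heavy for eps = 0.4
ok = is_eps_net([(7, 0)], line, sample_disks, 0.4)
missed = is_eps_net([(1, 0)], line, sample_disks, 0.4)
```

Only the heavy disk constrains the net, so a net point inside the light disk alone is not enough.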

In 1994, Brönnimann and Goodrich [5] proved the following interesting connection between the hitting-set
problem and $\epsilon$-nets: let $(P, \mathcal{D})$ be a range space for which we want to compute a minimum
hitting set. If one can compute an $\epsilon$-net of size $c/\epsilon$ for the $\epsilon$-net problem for
$(P, \mathcal{D})$ in polynomial time, then one can compute a hitting set of size at most
$c \cdot \mathrm{OPT}$ for $(P, \mathcal{D})$, where $\mathrm{OPT}$ is the size of the optimal (smallest)
hitting set, in polynomial time. A shorter, simpler proof was given by Even et al. [11]. Both these proofs
construct an assignment of weights to points in $P$ such that the total weight of each range
$D \in \mathcal{D}$ (i.e., the sum of the weights of the points in $D$) is at least a $(1/\mathrm{OPT})$-th
fraction of the total weight. Then a $(1/\mathrm{OPT})$-net with these weights is a hitting set. Until very
recently, the best such rounding algorithms had running times of $\Omega(n^2)$, and it had been a long-standing
open problem to compute an $O(1)$-approximation to the hitting-set problem for disks in the plane in near-linear
time. In a recent breakthrough, Agarwal and Pan [2] presented an algorithm that is able to do the required
rounding efficiently for a broad set of geometric objects. In particular, they are able to get the first
near-linear algorithm for computing $O(1)$-approximations for hitting sets for disks.


Bounds on $\epsilon$-nets. The result of Agarwal and Pan [2] opens the way, for the first time, for near
linear-time algorithms for the geometric hitting set problem. The catch is that the approximation factor depends
on the sizes of $\epsilon$-nets for disks; despite over 7 different proofs of $O(1/\epsilon)$-sized
$\epsilon$-nets for disks, the precise bounds are not very encouraging. The paper containing the earliest proof,
Matoušek et al. [22], appeared over twenty-two years ago and summarized its result thus:

“Note that in principle the $\epsilon$-net construction presented in this paper can be transformed into a
deterministic algorithm that runs in polynomial time, $O(n^3)$ at worst. However, we certainly would not advocate
this algorithm as being practical. We find the resulting constant of proportionality also not particularly
flattering.” [22]

So far, the best constants for the $\epsilon$-nets come from the proofs in [27] and [17]. The latter paper
presents five proofs for the existence of linear-size $\epsilon$-nets for halfspaces in $\mathbb{R}^3$. The best
constant for disks is obtained by using their first proof. A lifting of the problem of disks to $\mathbb{R}^3$
gives an $\epsilon$-net problem with lower halfspaces in $\mathbb{R}^3$, for which the latter paper obtains a
bound of $\frac{4}{\epsilon} f(\alpha)$ for a suitable constant $\alpha$, where $f(\alpha)$ is the best bound on
the size of an $\alpha$-net for lower halfspaces in $\mathbb{R}^3$. Using the lower bound of [26] for halfspaces
in $\mathbb{R}^2$, $f(\alpha) \ge \lceil 2/\alpha \rceil - 1 \ge 6$, although we believe that it is at least 10
since even for $\epsilon = 1/2$, no $\epsilon$-net construction of size less than 10 is known. Thus, the best
constructions so far give a bound that is at least $24/\epsilon$ and most likely more than $40/\epsilon$.
Furthermore, there is no implementation or software solution available that can even compute such
$\epsilon$-nets efficiently.


Our Contributions 

We prove new improved bounds on sizes of $\epsilon$-nets and present efficient algorithms to compute such nets.
Our approach is simple: we will show that modifications to a well-known technique for computing $\epsilon$-nets -
the sample-and-refine approach of Chazelle-Friedman [9] - together with additional structural properties of
Delaunay triangulations in fact result in $\epsilon$-nets of surprisingly low size:

Theorem 1.1. Given a set $P$ of $n$ points in $\mathbb{R}^2$, there exists an $\epsilon$-net under disk ranges
of size at most $13.4/\epsilon$. Furthermore it can be computed in expected time $O(n \log n)$.

A major advantage of Delaunay triangulations is that their behavior has been extensively studied, there are
many efficient implementations available, and they exhibit good behavior for various real-world data-sets
as well as random point sets. The algorithm, using CGAL, is furthermore simple to implement. We have
implemented it, and present the sizes of $\epsilon$-nets for various real-world data-sets; the results indicate
that our theoretical analysis closely tracks the actual size of the nets. This can additionally be seen as
continuing the program for better analysis of basic geometric tools; see, e.g., Har-Peled [16] for analysis of
algorithms and Matoušek [23] for detailed analysis, both for a related structure called cuttings in the plane.

Together with the result of Agarwal-Pan, this immediately implies the following: 

Corollary 1.1. For any $\delta > 0$, one can compute a $(13.4 + \delta)$-approximation to the minimum hitting
set for $(P, \mathcal{D})$ in time $\tilde{O}(n)$.

2 A near linear time algorithm for computing $\epsilon$-nets for disks in the plane

Through a more careful analysis, we present an algorithm for computing an $\epsilon$-net of size
$13.4/\epsilon$, running in near linear time. The method, shown in Algorithm 1, computes a random sample and
then solves certain subproblems involving subsets located in pairs of Delaunay disks circumscribing adjacent
triangles in the Delaunay triangulation of the random sample. The key to improved bounds is i) considering edges
in the Delaunay triangulation instead of faces in the analysis, and ii) new improved constructions for large
values of $\epsilon$.

Let $\Delta(abc)$ denote the triangle defined by the three points $a$, $b$ and $c$. $D_{abc}$ denotes the disk
through $a$, $b$ and $c$, while $D_{ab\bar{c}}$ denotes the halfspace defined by $a$ and $b$ not containing the
point $c$. Let $c(D)$ denote the center of the disk $D$.

Let $\Xi(R)$ be the Delaunay triangulation of a set of points $R \subseteq P$ in the plane. We will use $\Xi$
when $R$ is clear from the context. For any triangle $\Delta \in \Xi$, let $D_\Delta$ be the Delaunay disk of
$\Delta$, and let $P_\Delta$ be the set of points of $P$ contained in $D_\Delta$. Similarly, for any edge
$e \in \Xi$, let $\Delta_e^1$ and $\Delta_e^2$ be the two triangles in $\Xi$ adjacent to $e$, and
$P_e = P_{\Delta_e^1} \cup P_{\Delta_e^2}$. If $e$ is on the convex hull, then one of the triangles is taken to
be the halfspace defined by $e$ not containing $R$.
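
The sets $P_\Delta$ and $P_e$ can be computed naively as in the following sketch. It assumes the triangulation is already given as index triples into the sample (for instance produced by a Delaunay library), treats Delaunay disks as open so that sample vertices on the boundary are excluded, and omits the halfspace used for convex-hull edges. All function names and the example coordinates are hypothetical.

```python
from collections import defaultdict

def circumdisk(a, b, c):
    """Return (center, squared radius) of the circle through a, b, c."""
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy), (ax - ux) ** 2 + (ay - uy) ** 2

def points_per_triangle_and_edge(sample, triangles, points, tol=1e-9):
    """Quadratic-time computation of P_Delta per triangle and P_e per edge."""
    p_delta = []
    for i, j, k in triangles:
        (ux, uy), r2 = circumdisk(sample[i], sample[j], sample[k])
        # Open disk (assumption): boundary points such as i, j, k are excluded.
        p_delta.append({p for p in points
                        if (p[0] - ux) ** 2 + (p[1] - uy) ** 2 < r2 - tol})
    p_edge = defaultdict(set)
    for t, (i, j, k) in enumerate(triangles):
        for edge in (frozenset((i, j)), frozenset((j, k)), frozenset((i, k))):
            p_edge[edge] |= p_delta[t]  # union over the adjacent triangles
    return p_delta, dict(p_edge)

# Hypothetical sample R with two Delaunay triangles sharing the edge {0, 1}.
R = [(0, 0), (4, 0), (2, 3), (2, -3)]
tris = [(0, 1, 2), (0, 1, 3)]
P = R + [(2, 1), (2, -1), (2, 2.9)]
p_delta, p_edge = points_per_triangle_and_edge(R, tris, P)
```

For the shared edge, $P_e$ is the union of the two triangle sets; hull edges here only receive the points of their single adjacent triangle.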


Algorithm 1: Compute $\epsilon$-nets

Data: Compute $\epsilon$-net, given $P$: set of $n$ points in $\mathbb{R}^2$, $\epsilon > 0$ and $c_1$.

 1  if $\epsilon n < 13$ then
 2      Return $P$
 3  Pick each point $p \in P$ into $R$ independently with probability $\frac{c_1}{\epsilon n}$.
 4  if $|R| \le c_1/(2\epsilon)$ then
 5      restart algorithm.
 6  Compute the Delaunay triangulation $\Xi$ of $R$.
 7  for triangles $\Delta \in \Xi$ do
 8      Compute the set of points $P_\Delta \subseteq P$ in Delaunay disk $D_\Delta$ of $\Delta$.
 9  for edges $e \in \Xi$ do
10      Let $\Delta_e^1$ and $\Delta_e^2$ be the two triangles adjacent to $e$, $P_e = P_{\Delta_e^1} \cup P_{\Delta_e^2}$.
11      Let $\epsilon' = \frac{\epsilon n}{|P_e|}$ and compute an $\epsilon'$-net $R_e$ for $P_e$ depending on the cases below:
12      if $2/3 \le \epsilon' < 1$ then
13          compute using Lemma 2.1
14      if $1/2 \le \epsilon' < 2/3$ then
15          compute using Lemma 2.2
16      if $\epsilon' < 1/2$ then
17          compute recursively.
18  Return $(\bigcup_e R_e) \cup R$.
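
The sampling phase of Algorithm 1 (lines 1-5) can be sketched as follows. This is an illustrative sketch only: the function name, the return convention and the `max_restarts` guard are assumptions, and the Delaunay and subproblem steps are left out.

```python
import random

def draw_sample(points, eps, c1=12.0, rng=None, max_restarts=1000):
    """Lines 1-5 of Algorithm 1: return (sample, is_final_net)."""
    rng = rng or random.Random()
    n = len(points)
    if eps * n < 13:
        return list(points), True        # small input: P itself is returned
    prob = c1 / (eps * n)                # each point is kept with this probability
    for _ in range(max_restarts):        # guard added for this sketch only
        sample = [p for p in points if rng.random() < prob]
        if len(sample) > c1 / (2 * eps): # otherwise restart, as in lines 4-5
            return sample, False
    raise RuntimeError("sampling kept failing; check eps and c1")

# Hypothetical input: 2000 points, eps = 0.05, so eps * n = 100.
pts = [(i, (i * i) % 97) for i in range(2000)]
sample, is_final = draw_sample(pts, 0.05, rng=random.Random(1))
small, small_final = draw_sample(pts[:100], 0.05)
```

With $c_1 = 12$ the expected sample size is $c_1/\epsilon = 240$ in this example, well above the restart threshold of 120.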


In order to prove that the algorithm gives the desired result, the following lemmas regarding the size of an
$\epsilon$-net will be useful. Let $f(\epsilon)$ be the size of the smallest $\epsilon$-net for any set $P$ of
points in $\mathbb{R}^2$ under disk ranges.

Lemma 2.1. For $\frac{2}{3} \le \epsilon < 1$, $f(\epsilon) \le 2$, and it can be computed in $O(n \log n)$ time.

Lemma 2.2. For $\frac{1}{2} \le \epsilon < \frac{2}{3}$, $f(\epsilon) \le 10$, and it can be computed in
$O(n \log n)$ time.

Proof. Divide the plane into 4 quadrants with 2 lines, intersecting at a point $q$, such that each quadrant
contains $n/4$ points. Using the Ham-Sandwich theorem, this can be done in linear time [21]. Create a
$\frac{2}{3}$-net for each quadrant, using Lemma 2.1. Add these 8 points to the $\epsilon$-net of $P$. If
$q \in P$ then add $q$ to the $\epsilon$-net; otherwise let $\Delta$ be the triangle in the Delaunay
triangulation of $P$ that contains the point $q$. Add the two vertices of $\Delta$ that are in the opposite
quadrants to the $\epsilon$-net. The resulting size of the net is at most 10. Denote the quadrant without a
vertex of the Delaunay triangle inside it by $Q$ and its opposite quadrant by $R$. If a disk $D$ intersects at
most 3 quadrants and does not contain any of the points from the $\frac{2}{3}$-net in each of those quadrants,
it can contain at most $3 \cdot \frac{2}{3} \cdot \frac{n}{4} = \frac{n}{2}$ points. On the other hand, if $D$
contains points from each of the 4 quadrants, then it must contain points from $Q$ and $R$ that are outside of
the Delaunay disk $D_\Delta$ of $\Delta$ (as $D_\Delta$ is empty of points of $P$). Then if $D$ does not contain
any of the two vertices of $\Delta$ in the opposite quadrants (already added to the $\epsilon$-net), it must
pierce $D_\Delta$, a contradiction. □

Figure 1: Setup around $q$.


Call a tuple $(\{p, q\}, \{r, s\})$, where $p, q, r, s \in P$, a Delaunay quadruple if
$\mathrm{int}(\Delta(pqr)) \cap \mathrm{int}(\Delta(pqs)) = \emptyset$. Define its weight, denoted
$W_{(\{p,q\},\{r,s\})}$, to be the number of points of $P$ in $D_{pqr} \cup D_{pqs}$. Let $\mathcal{T}_{\le k}$
be the set of Delaunay quadruples of $P$ of weight at most $k$, and similarly let $\mathcal{T}_k$ denote the set
of Delaunay quadruples of weight exactly $k$. Similarly, a Delaunay triple is given by $(\{p, q\}, \{r\})$,
where $p, q, r \in P$. Define its weight, denoted $W_{(\{p,q\},\{r\})}$, to be the number of points of $P$ in
$D_{pqr} \cup D_{pq\bar{r}}$. Let $\mathcal{S}_{\le k}$ be the set of Delaunay triples of $P$ of weight at most
$k$, and let $\mathcal{S}_k$ denote the set of Delaunay triples of weight exactly $k$.

One can upper bound the sizes of $\mathcal{T}_{\le k}$ and $\mathcal{S}_{\le k}$ and, using these bounds, derive
an upper bound on the expected number of sub-problems with a certain number of points.

Claim 2.3. $|\mathcal{T}_{\le k}| \le (e^3/9)\, n k^3$ asymptotically and
$|\mathcal{T}_{\le k}| \le (3.1)\, n k^3$ for $k \ge 13$.


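Proof. The proof is an application of the Clarkson-Shor technique [21]. Pick each point in $P$ independently
with probability $p_{cs}$ to get a random sample $R_{cs}$. Count the expected number of edges in the Delaunay
triangulation of $R_{cs}$ in two ways. On one hand, it is simply less than
$3\,\mathbb{E}[|R_{cs}|] = 3 n p_{cs}$. On the other hand, it is:

\[
\begin{aligned}
3 n p_{cs} &> \mathbb{E}[\text{number of Delaunay edges in } R_{cs}]
  = \sum_{p,q \in P} \Pr[\{p, q\} \text{ is a Delaunay edge of } R_{cs}] \\
&\ge \sum_{p,q \in P} \sum_{r,s \in P} \Pr[(D_{pqr} \cup D_{pqs}) \cap R_{cs} = \emptyset]
  \quad \text{(disjoint events)} \\
&\ge \sum_{(\{p,q\},\{r,s\}) \in \mathcal{T}_{\le k}} \Pr[(D_{pqr} \cup D_{pqs}) \cap R_{cs} = \emptyset] \\
&\ge \sum_{(\{p,q\},\{r,s\}) \in \mathcal{T}_{\le k}} p_{cs}^4 (1 - p_{cs})^k
  = |\mathcal{T}_{\le k}| \cdot p_{cs}^4 \cdot (1 - p_{cs})^k .
\end{aligned}
\]

Therefore $|\mathcal{T}_{\le k}| \le 3 n p_{cs} / (p_{cs}^4 (1 - p_{cs})^k)$, and a simple calculation gives that
setting $p_{cs} = \frac{3}{k+3}$ minimizes the right hand side. Then
$|\mathcal{T}_{\le k}| \le 3n \frac{3}{k+3} \big/ \big( (\frac{3}{k+3})^4 (1 - \frac{3}{k+3})^k \big)
= n \frac{k^3}{9} (1 + \frac{3}{k})^{k+3}$, and the claim follows.

□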


Claim 2.4. $|\mathcal{S}_{\le k}| \le (e^2/4)\, n k^2$ asymptotically and
$|\mathcal{S}_{\le k}| \le (2.14)\, n k^2$ for $k \ge 13$.

Proof. Pick each point in $P$ independently with probability $p_{cs}$ to get a random sample $R_{cs}$. Count the
expected number of edges in the Delaunay triangulation of $R_{cs}$ that lie on the boundary of the Delaunay
triangulation, i.e., adjacent to exactly one triangle, in two ways. On one hand, it is exactly the number of
edges in the convex hull of $R_{cs}$, therefore at most $\mathbb{E}[|R_{cs}|] = n p_{cs}$. Counted another way,
it is:

\[
\begin{aligned}
n p_{cs} &\ge \mathbb{E}[\text{number of boundary Delaunay edges in } R_{cs}]
  = \sum_{p,q \in P} \Pr[\{p, q\} \text{ is a boundary Delaunay edge of } R_{cs}] \\
&\ge \sum_{p,q \in P} \sum_{r \in P} \Pr[(D_{pqr} \cup D_{pq\bar{r}}) \cap R_{cs} = \emptyset]
  \quad \text{(disjoint events)} \\
&\ge \sum_{(\{p,q\},\{r\}) \in \mathcal{S}_{\le k}} \Pr[(D_{pqr} \cup D_{pq\bar{r}}) \cap R_{cs} = \emptyset] \\
&\ge \sum_{(\{p,q\},\{r\}) \in \mathcal{S}_{\le k}} p_{cs}^3 (1 - p_{cs})^k
  = |\mathcal{S}_{\le k}| \cdot p_{cs}^3 \cdot (1 - p_{cs})^k .
\end{aligned}
\]

Setting $p_{cs} = \frac{2}{k+2}$ gives the required result. □
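
The constants in Claims 2.3 and 2.4 can be checked numerically. The sketch below assumes the optimizing choices $p_{cs} = 3/(k+3)$ for quadruples and $p_{cs} = 2/(k+2)$ for triples (our reading of the calculation), which turn the two bounds into $\frac{1}{9}(1 + 3/k)^{k+3}\, n k^3$ and $\frac{1}{4}(1 + 2/k)^{k+2}\, n k^2$. The function names are hypothetical.

```python
import math

def quadruple_constant(k):
    """Coefficient of n*k^3 in the bound on quadruples, for p_cs = 3/(k+3)."""
    return (1 + 3 / k) ** (k + 3) / 9

def triple_constant(k):
    """Coefficient of n*k^2 in the bound on triples, for p_cs = 2/(k+2)."""
    return (1 + 2 / k) ** (k + 2) / 4

# Both coefficients decrease in k, so their values at k = 13 bound all k >= 13.
quad_at_13 = quadruple_constant(13)
trip_at_13 = triple_constant(13)
quad_limit = math.e ** 3 / 9   # asymptotic constant of Claim 2.3
trip_limit = math.e ** 2 / 4   # asymptotic constant of Claim 2.4
```

Under these assumptions the values at $k = 13$ fall just below 3.1 and 2.14 respectively, and the coefficients tend to $e^3/9$ and $e^2/4$ as $k$ grows.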

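Claim 2.5.

\[
\mathbb{E}\big[\,|\{e \in \Xi \mid k_1 \epsilon n \le |P_e| \le k_2 \epsilon n\}|\,\big]
\le \frac{(3.1)\, c_1^3}{\epsilon\, e^{k_1 c_1}} \left(k_1^3 c_1 + 3.7\, k_2^2\right)
\quad \text{if } \epsilon n \ge 13 .
\]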


Proof. The crucial observation is that two points $\{p, q\}$ form an edge in $\Xi$ with two adjacent triangles
$\Delta(pqr), \Delta(pqs) \in \Xi$ iff $\{p, q, r, s\} \subseteq R$ and none of the points of $P$ in
$D_{pqr} \cup D_{pqs}$ are picked in $R$ (i.e., the points $p, q, r, s$ form the Delaunay quadruple
$(\{p, q\}, \{r, s\})$). Similarly, $\{p, q\}$ form an edge on the convex hull of $\Xi$ with one adjacent
triangle $\Delta(pqr)$ iff $\{p, q, r\} \subseteq R$ and none of the points of $P$ in
$D_{pqr} \cup D_{pq\bar{r}}$ are picked in $R$.

Let $X_{(\{p,q\},\{r,s\})}$ be the random variable that is 1 iff $\{p, q\}$ form an edge in $\Xi$ and their two
adjacent triangles are $\Delta(pqr)$ and $\Delta(pqs)$. Let $X_{(\{p,q\},\{r\})}$ be the random variable that is
1 iff $\{p, q\}$ form an edge in $\Xi$ with exactly one adjacent triangle $\Delta(pqr)$. Noting that every edge
in $\Xi$ must come from either a Delaunay quadruple or a Delaunay triple,

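\[
\mathbb{E}\big[\,|\{e \mid k_1 \epsilon n \le |P_e| \le k_2 \epsilon n\}|\,\big]
= \sum_{\substack{p,q,r,s \in P \\ k_1 \epsilon n \le W_{(\{p,q\},\{r,s\})} \le k_2 \epsilon n}}
    \Pr\big[X_{(\{p,q\},\{r,s\})} = 1\big]
+ \sum_{\substack{p,q,r \in P \\ k_1 \epsilon n \le W_{(\{p,q\},\{r\})} \le k_2 \epsilon n}}
    \Pr\big[X_{(\{p,q\},\{r\})} = 1\big] .
\]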


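The second term is asymptotically smaller, so we bound it somewhat loosely:

\[
\begin{aligned}
\sum_{\substack{p,q,r \in P \\ k_1 \epsilon n \le W_{(\{p,q\},\{r\})} \le k_2 \epsilon n}}
  \Pr\big[X_{(\{p,q\},\{r\})} = 1\big]
&\le \sum_{\substack{p,q,r \\ k_1 \epsilon n \le W_{(\{p,q\},\{r\})} \le k_2 \epsilon n}}
  (c_1/\epsilon n)^3 (1 - c_1/\epsilon n)^{W_{(\{p,q\},\{r\})}} \\
&\le |\mathcal{S}_{\le k_2 \epsilon n}| \cdot (c_1/\epsilon n)^3 (1 - c_1/\epsilon n)^{k_1 \epsilon n} \\
&\le (2.14)\, n (k_2 \epsilon n)^2 \cdot (c_1/\epsilon n)^3 \cdot e^{-c_1 k_1}
 = \frac{(2.14)\, k_2^2 c_1^3}{\epsilon\, e^{k_1 c_1}} .
\end{aligned}
\]
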
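Now we carefully bound the first term:

\[
\begin{aligned}
\sum_{\substack{p,q,r,s \in P \\ k_1 \epsilon n \le W_{(\{p,q\},\{r,s\})} \le k_2 \epsilon n}}
  \Pr\big[X_{(\{p,q\},\{r,s\})} = 1\big]
&= \sum_{i = k_1 \epsilon n}^{k_2 \epsilon n} \;
   \sum_{\substack{p,q,r,s \\ W_{(\{p,q\},\{r,s\})} = i}} \Pr\big[X_{(\{p,q\},\{r,s\})} = 1\big] \\
&= \sum_{i = k_1 \epsilon n}^{k_2 \epsilon n} \;
   \sum_{\substack{p,q,r,s \\ W_{(\{p,q\},\{r,s\})} = i}} (c_1/\epsilon n)^4 (1 - c_1/\epsilon n)^i \\
&\le \sum_{i = k_1 \epsilon n}^{k_2 \epsilon n} |\mathcal{T}_i| \, (c_1/\epsilon n)^4 (1 - c_1/\epsilon n)^i .
\end{aligned}
\]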


As the above summation is exponentially decreasing as a function of $i$, it is maximized when
$|\mathcal{T}_{i_0}| = \max |\mathcal{T}_{\le i_0}|$ where $i_0 = k_1 \epsilon n$, and
$|\mathcal{T}_i| = \max |\mathcal{T}_{\le i}| - \max |\mathcal{T}_{\le i-1}|$ and so on. Using Claim 2.3 we obtain:


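\[
\begin{aligned}
&\le |\mathcal{T}_{\le k_1 \epsilon n}| \cdot (c_1/\epsilon n)^4 (1 - c_1/\epsilon n)^{k_1 \epsilon n}
   + \sum_{i = k_1 \epsilon n + 1}^{k_2 \epsilon n}
     \big(|\mathcal{T}_{\le i}| - |\mathcal{T}_{\le i-1}|\big) \cdot (c_1/\epsilon n)^4 (1 - c_1/\epsilon n)^i \\
&\le (3.1)\, n (k_1 \epsilon n)^3 \cdot (c_1/\epsilon n)^4 (1 - c_1/\epsilon n)^{k_1 \epsilon n}
   + \sum_{i = k_1 \epsilon n + 1}^{k_2 \epsilon n}
     (3.1)\, n \cdot 3 i^2 \cdot (c_1/\epsilon n)^4 (1 - c_1/\epsilon n)^i \\
&\le (3.1) \frac{k_1^3 c_1^4 e^{-k_1 c_1}}{\epsilon}
   + (3.1) \frac{3 k_2^2 c_1^4}{\epsilon^2 n}
     \sum_{i = k_1 \epsilon n + 1}^{k_2 \epsilon n} (1 - c_1/\epsilon n)^i \\
&\le (3.1) \frac{k_1^3 c_1^4 e^{-k_1 c_1}}{\epsilon}
   + (3.1) \frac{3 k_2^2 c_1^4}{\epsilon^2 n} \cdot
     \frac{(1 - c_1/\epsilon n)^{k_1 \epsilon n}}{c_1/\epsilon n} \\
&\le \frac{(3.1)\, c_1^3}{\epsilon\, e^{k_1 c_1}} \left(k_1^3 c_1 + 3 k_2^2\right) .
\end{aligned}
\]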

The proof follows by summing up the two terms. 

Using the above facts we can prove the main result.

Lemma 2.6. Algorithm COMPUTE $\epsilon$-NET computes an $\epsilon$-net of expected size $13.4/\epsilon$.



Proof. First we show that the algorithm computes an $\epsilon$-net. Take any disk $D$ with center $c$ containing
$\epsilon n$ points of $P$, and not hit by the initial random sample $R$. Increase its radius while keeping its
center $c$ fixed until it passes through a point, say $p_1$, of $R$. Now further expand the disk by moving $c$
in the direction $\overrightarrow{p_1 c}$ until its boundary passes through a second point $p_2$ of $R$. The
edge $e$ defined by $p_1$ and $p_2$ belongs to $\Xi$, and the two extreme disks in the pencil of empty disks
through $p_1$ and $p_2$ are the disks $D_{\Delta_e^1}$ and $D_{\Delta_e^2}$. Their union covers $D$, and so $D$
contains $\epsilon n$ points out of the set $P_e$. Then the net $R_e$ computed for $P_e$ must hit $D$, as
$\epsilon n = (\epsilon n / |P_e|) \cdot |P_e|$.


For the expected size, clearly, if $\epsilon n < 13$ then the returned set is an $\epsilon$-net of size at most
$13/\epsilon$. Otherwise we can calculate the expected number of points added to the $\epsilon$-net while
solving the sub-problems. We simply group them by the number of points in them. Set
$\Xi_i = \{e \mid 2^i \epsilon n \le |P_e| < 2^{i+1} \epsilon n\}$, and let us denote the size of the
$\epsilon$-net returned by our algorithm by $f'(\epsilon)$. Then


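\[
\begin{aligned}
\mathbb{E}[f'(\epsilon)] &= \mathbb{E}[|R|] + \mathbb{E}\Big[\,\Big|\bigcup_{e \in \Xi} R_e\Big|\,\Big] \\
&\le \frac{c_1}{\epsilon}
  + \mathbb{E}\big[\,|\{e \mid \epsilon n \le |P_e| < 3\epsilon n/2\}|\,\big] \cdot f(2/3) \\
&\quad + \mathbb{E}\big[\,|\{e \mid 3\epsilon n/2 \le |P_e| < 2\epsilon n\}|\,\big] \cdot f(1/2) \\
&\quad + \mathbb{E}\Big[\sum_{i \ge 1} \sum_{e \in \Xi_i} f'\Big(\frac{\epsilon n}{|P_e|}\Big)\Big] .
\end{aligned}
\]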
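Noting that
$\mathbb{E}\big[\sum_{e \in \Xi_i} f'(\frac{\epsilon n}{|P_e|}) \,\big|\, |\Xi_i| = t\big] \le t\,\mathbb{E}[f'(1/2^{i+1})]$,
we get

\[
\mathbb{E}\Big[\sum_{e \in \Xi_i} f'\Big(\frac{\epsilon n}{|P_e|}\Big)\Big]
= \mathbb{E}\Big[\,\mathbb{E}\Big[\sum_{e \in \Xi_i} f'\Big(\frac{\epsilon n}{|P_e|}\Big) \,\Big|\, |\Xi_i|\Big]\Big]
\le \mathbb{E}\big[\,|\Xi_i| \cdot \mathbb{E}[f'(1/2^{i+1})]\,\big]
= \mathbb{E}[|\Xi_i|] \cdot \mathbb{E}[f'(1/2^{i+1})]
\]

as $|\Xi_i|$ and $f'(\cdot)$ are independent. As $\epsilon' = \frac{\epsilon n}{|P_e|} > \epsilon$, by induction,
assume $\mathbb{E}[f'(\epsilon')] \le \frac{13.4}{\epsilon'}$. Then

\[
\begin{aligned}
\mathbb{E}[f'(\epsilon)] &\le \frac{c_1}{\epsilon}
  + \frac{(3.1)\, c_1^3 (c_1 + 8.34)}{\epsilon\, e^{c_1}} \cdot 2
  + \frac{(3.1)\, c_1^3 ((3/2)^3 c_1 + 14.8)}{\epsilon\, e^{3 c_1/2}} \cdot 10 \\
&\quad + \sum_{i \ge 1} \frac{(3.1)\, c_1^3 (2^{3i} c_1 + 3.7 \cdot 2^{2i+2})}{\epsilon\, e^{c_1 2^i}}
  \cdot 13.4 \cdot 2^{i+1}
 \;\le\; \frac{13.4}{\epsilon}
\end{aligned}
\]

by setting $c_1 = 12$. □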
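
The final inequality can be checked numerically. The sketch below evaluates the right-hand side multiplied by $\epsilon$, assuming the four terms are $c_1$, the Lemma 2.1 term with factor 2, the Lemma 2.2 term with factor 10, and the recursive terms with factor $13.4 \cdot 2^{i+1}$ (our reading of the bound). The function name and the truncation of the sum at 40 terms are choices made for this sketch.

```python
import math

def expected_net_constant(c1=12.0, target=13.4, terms=40):
    """Evaluate epsilon * (upper bound on E[f'(epsilon)]) for a given c1."""
    sample = c1                                             # E[|R|] * eps
    lemma21 = 3.1 * c1**3 * (c1 + 8.34) * math.exp(-c1) * 2
    lemma22 = 3.1 * c1**3 * ((3 / 2)**3 * c1 + 14.8) * math.exp(-1.5 * c1) * 10
    recursive = sum(
        3.1 * c1**3 * (2**(3 * i) * c1 + 3.7 * 2**(2 * i + 2))
        * math.exp(-c1 * 2**i)          # underflows harmlessly to 0.0 for large i
        * target * 2**(i + 1)
        for i in range(1, terms)
    )
    return sample + lemma21 + lemma22 + recursive

value = expected_net_constant()
```

With $c_1 = 12$ the sum is dominated by the sample itself (12) and the Lemma 2.1 term (about 1.34); the remaining terms contribute less than 0.05.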

Finally, we bound the expected running time of the algorithm. 

Lemma 2.7. Algorithm COMPUTE $\epsilon$-NET runs in expected time $O(n \log n)$.


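Proof. Note that $\mathbb{E}[|R|] = c_1/\epsilon$. First we bound the expected total size of all the sets $P_e$:

\[
\begin{aligned}
\mathbb{E}\Big[\sum_{e \in \Xi} |P_e|\Big]
&\le \mathbb{E}\big[\,|\{e \mid 0 \le |P_e| < \epsilon n\}|\,\big] \cdot \epsilon n
  + \sum_{i \ge 0} \mathbb{E}\big[\,|\{e \mid 2^i \epsilon n \le |P_e| < 2^{i+1} \epsilon n\}|\,\big]
    \cdot 2^{i+1} \epsilon n \\
&\le O(n) + \sum_{i \ge 0} O\Big(\frac{(2^i)^3}{\epsilon\, e^{2^i c_1}}\Big) \cdot 2^{i+1} \epsilon n = O(n),
\end{aligned}
\]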


as the last summation is a geometric series. This implies that the expected total number of incidences
between points in $P$ and Delaunay disks in $\Xi$ is $O(n)$. The Delaunay triangulation of $R$ can be computed
in expected time $O(\frac{1}{\epsilon} \log \frac{1}{\epsilon})$. Steps 7-8 compute, for each Delaunay disk
$D \in \Xi$, the list of points contained in $D$. This can be computed in $O(n \log \frac{1}{\epsilon})$ time by
instead finding, for each $p \in P$, the list of Delaunay disks in $\Xi$ containing $p$, as follows. First do
point-location in $\Xi$ to locate the triangle $\Delta$ containing $p$, in expected time
$O(\log \frac{1}{\epsilon})$. Clearly $D_\Delta$ contains $p$. Now starting from $\Delta$, do a breadth-first
search in the dual planar graph of the Delaunay triangulation to find the maximally connected subset of
triangles (vertices in the dual graph) whose Delaunay disks contain $p$. As each vertex in the dual graph has
degree at most 3, this takes time proportional to the discovered list of triangles, which as shown earlier is
$O(n)$ over all $p \in P$. The correctness follows from the following:

Fact 2.8. Given a Delaunay triangulation $\Xi$ on $R$ and any point $p \in \mathbb{R}^2$, the set of triangles
in $\Xi$ whose Delaunay disks contain $p$ form a connected sub-graph in the dual graph to $\Xi$.



Proof. This can be seen by lifting $P$ to $\mathbb{R}^3$ via the Veronese mapping, where it follows from the fact
that the faces of a convex polyhedron that are visible from any exterior point are connected. □
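
The breadth-first search justified by Fact 2.8 can be sketched as follows. The sketch assumes the triangulation is given as index triples into the sample, that the starting triangle has already been found by point location, and that no triangle is degenerate. All names and the two-triangle example are hypothetical; a real implementation would use the adjacency information of a triangulation library.

```python
from collections import deque

def circumdisk(a, b, c):
    """Return (center, squared radius) of the circle through a, b, c."""
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy), (ax - ux) ** 2 + (ay - uy) ** 2

def tri_edges(tri):
    i, j, k = tri
    return (frozenset((i, j)), frozenset((j, k)), frozenset((i, k)))

def build_adjacency(triangles):
    """Map each edge to the triangles containing it (the dual graph)."""
    by_edge = {}
    for t, tri in enumerate(triangles):
        for e in tri_edges(tri):
            by_edge.setdefault(e, []).append(t)
    return by_edge

def disks_containing(p, sample, triangles, by_edge, start):
    """BFS from `start` over triangles whose Delaunay disk contains p."""
    def contains(t):
        (ux, uy), r2 = circumdisk(*(sample[v] for v in triangles[t]))
        return (p[0] - ux) ** 2 + (p[1] - uy) ** 2 <= r2
    found, queue = {start}, deque([start])
    while queue:
        t = queue.popleft()
        for e in tri_edges(triangles[t]):
            for nb in by_edge[e]:              # at most one unseen neighbour per edge
                if nb not in found and contains(nb):
                    found.add(nb)
                    queue.append(nb)
    return found

R = [(0, 0), (4, 0), (2, 3), (2, -3)]
tris = [(0, 1, 2), (0, 1, 3)]
adj = build_adjacency(tris)
both = disks_containing((2, 0.1), R, tris, adj, start=0)
only_first = disks_containing((2, 2.9), R, tris, adj, start=0)
```

The search never expands a triangle whose disk misses the query point, so the work is proportional to the number of reported triangles plus their neighbours.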

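Note that by the $\epsilon$-net theorem, the probability of restarting the algorithm (lines 4-5) at any call is
at most a constant. Therefore, in expectation, it is restarted at most a constant number of times, and so the
expected running time, denoted by $T(n)$, satisfies:

\[
\mathbb{E}[T(n)] = O\Big(\frac{1}{\epsilon} \log \frac{1}{\epsilon}\Big) + O\Big(n \log \frac{1}{\epsilon}\Big)
  + \sum_{e \in \Xi} \mathbb{E}[T(|P_e|)]
\le O\Big(n \log \frac{1}{\epsilon}\Big) + \sum_{e \in \Xi} \mathbb{E}[T(|P_e|)] .
\]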


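Similarly to previous calculations we have that

\[
\begin{aligned}
\mathbb{E}[T(n)] &\le O\Big(n \log \frac{1}{\epsilon}\Big)
  + \frac{(3.1)\, c_1^3 (c_1 + 8.34)}{\epsilon\, e^{c_1}} \cdot O\big((3\epsilon n/2) \log(3\epsilon n/2)\big) \\
&\quad + \frac{(3.1)\, c_1^3 ((3/2)^3 c_1 + 14.8)}{\epsilon\, e^{3 c_1/2}} \cdot O\big(2\epsilon n \log(2\epsilon n)\big) \\
&\quad + \sum_{i \ge 1} \frac{(3.1)\, c_1^3 (2^{3i} c_1 + 3.7 \cdot 2^{2i+2})}{\epsilon\, e^{c_1 2^i}}
  \cdot \mathbb{E}[T(2^{i+1} \epsilon n)] \\
&\le d n \log n + \sum_{i \ge 1} \frac{(3.1)\, c_1^3 (2^{3i} c_1 + 3.7 \cdot 2^{2i+2})}{\epsilon\, e^{c_1 2^i}}
  \cdot \mathbb{E}[T(2^{i+1} \epsilon n)]
\end{aligned}
\]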


for a constant $d$ coming from the constants above, as well as in Delaunay triangulation, point-location and
list-construction computations. Setting $\mathbb{E}[T(k)] = c k \log k$ satisfies the above inequality for
$c \ge 2d$, since

\[
\begin{aligned}
\mathbb{E}[T(n)] &\le d n \log n
  + \sum_{i \ge 1} \frac{(3.1)\, c_1^3 (2^{3i} c_1 + 3.7 \cdot 2^{2i+2})}{\epsilon\, e^{c_1 2^i}}
    \cdot c\, (2^{i+1} \epsilon n) \log(2^{i+1} \epsilon n) \\
&\le d n \log n + (c n \log n)
  \sum_{i \ge 1} \frac{2^{i+1} \cdot (3.1) \cdot 12^3 (2^{3i} \cdot 12 + 3.7 \cdot 2^{2i+2})}{e^{12 \cdot 2^i}} \\
&\le d n \log n + c n \log n \cdot \frac{1}{2} \le c n \log n, \quad \text{for } c \ge 2d .
\end{aligned}
\]




3 Implementation and Experiments 

In this section we present experimental results for our algorithm running on a machine equipped with an
Intel Core i7 870 processor with 4 cores each running at 2.93 GHz and with 16 GB main memory. All our
implementations are single threaded in order to have a fair comparison. For nearest-neighbors and Delaunay
triangulations, we use the well-known geometry library CGAL. It computes Delaunay triangulations in
expected $O(n \log n)$ time. Instead of computing centerpoints, we will recurse for all values of $\epsilon'$;
this results in simple efficient code, at the cost of slightly larger constants.

In order to empirically validate the size of the $\epsilon$-net obtained by our random sampling algorithm we
have utilized several datasets. The MOPSI Finland dataset contains 13,467 locations of users in Finland.
The KDDCUP04Bio dataset contains the first 2 dimensions of a protein dataset with 145,751 entries. The
Europe and Birch3 datasets have 169,308 and 100,000 entries respectively. We have created two random
data sets, Uniform and Gauss9, with 50,000 and 90,000 points. The former is sampled from a uniform
distribution while the latter is sampled from 9 different Gaussian distributions whose means and covariance
matrices are randomly generated. Setting the probability for random sampling to $\frac{12}{\epsilon n}$ results
in nets of size approximately $\frac{12}{\epsilon}$ for nearly all datasets, as expected by our analysis. We
note however that in practice setting $c_1$ to 7 gives smaller $\epsilon$-nets, of size around
$\frac{9}{\epsilon}$. See Figure 2 for the dependency of the net size on $c_1$ while setting $\epsilon$ to 0.01.
In Table 1 we list $\epsilon$-net sizes for different values of $\epsilon$ while setting $c_1$ to 12.


Dataset          $\epsilon$-net size
                 $\epsilon = 0.2$   $\epsilon = 0.1$   $\epsilon = 0.01$   $\epsilon = 0.001$
MOPSI Finland    83                 128                1226                12011
KDDCUP04Bio      55                 118                1176                11902
Europe           69                 119                1205                12043
Birch3           58                 125                1198                11878
Uniform          70                 109                1245                12034
Gauss9           58                 120                1275                12011

Table 1: $\epsilon$-net sizes for various point sets, $c_1 = 12$.
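
To compare the measured sizes with the $13.4/\epsilon$ bound, one can multiply each entry of Table 1 by its $\epsilon$. The short sketch below does this with the table values copied from above; the variable and function names are hypothetical.

```python
# Net sizes from Table 1 (c1 = 12), keyed by dataset and epsilon.
TABLE_1 = {
    "MOPSI Finland": {0.2: 83, 0.1: 128, 0.01: 1226, 0.001: 12011},
    "KDDCUP04Bio":   {0.2: 55, 0.1: 118, 0.01: 1176, 0.001: 11902},
    "Europe":        {0.2: 69, 0.1: 119, 0.01: 1205, 0.001: 12043},
    "Birch3":        {0.2: 58, 0.1: 125, 0.01: 1198, 0.001: 11878},
    "Uniform":       {0.2: 70, 0.1: 109, 0.01: 1245, 0.001: 12034},
    "Gauss9":        {0.2: 58, 0.1: 120, 0.01: 1275, 0.001: 12011},
}

def normalized_sizes(table):
    """Return size * epsilon for every entry, i.e. the constant c in c/epsilon."""
    return {name: {eps: size * eps for eps, size in row.items()}
            for name, row in table.items()}

constants = normalized_sizes(TABLE_1)
largest_small_eps = max(constants[name][eps]
                        for name in constants for eps in (0.1, 0.01, 0.001))
```

For $\epsilon \le 0.1$ every product in the table lies between roughly 10.9 and 12.8, while at $\epsilon = 0.2$ the products range from 11.0 to 16.6.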


4 Conclusion 

In this paper we have improved upon the constants in the previous construction of $\epsilon$-nets for disks in
the plane. Our method gives an efficient practical algorithm for computing such $\epsilon$-nets, which we have
implemented and tested on a variety of data-sets. We conclude with a list of open problems:

• Currently the best known lower bound is the $2/\epsilon$ bound for halfspaces in $\mathbb{R}^2$. It remains an
interesting question to improve this lower bound, or improve the upper bounds given in this paper.

• Currently the algorithm of Agarwal and Pan [2] uses a number of heavy tools (dynamic range reporting,
dynamic approximate range counting) that hinder an efficient and practical implementation of
their algorithm. It would be considerable progress to derive a more practical method with provable
guarantees.